%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%  Copyright by Wenliang Du.                                       %%
%%  This work is licensed under the Creative Commons                %%
%%  Attribution-NonCommercial-ShareAlike 4.0 International License. %%
%%  To view a copy of this license, visit                           %%
%%  http://creativecommons.org/licenses/by-nc-sa/4.0/.              %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newcommand{\commonfolder}{../../common-files}

\input{\commonfolder/header}
\input{\commonfolder/copyright}


\lhead{\bfseries SEED Labs -- Pseudo Random Number Generation Lab}


\begin{document}


\begin{center}
{\LARGE Pseudo Random Number Generation Lab}
\end{center}

\seedlabcopyright{2018}



% *******************************************
% SECTION
% ******************************************* 
\section{Overview}

Generating random numbers is a quite common task in security software. 
In many cases, encryption keys are not provided by users, but are instead 
generated inside the software. Their randomness is extremely important;
otherwise, attackers can predict the encryption key, and thus defeat the 
purpose of encryption. Many developers know how to generate random
numbers (e.g. for Monte Carlo simulation)
from their prior experiences, so they use the similar methods
to generate the random numbers for security purpose. Unfortunately,
a sequence of random numbers may be good for Monte Carlo simulation, but
they may be bad for encryption keys. Developers need to know how to
generate secure random numbers, or they will make mistakes. Similar
mistakes have been made in some well-known products, including
Netscape and Kerberos. 

In this lab, students will learn why the typical 
random number generation method is not appropriate for generating 
secrets, such as encryption keys. They will further learn 
a standard way to generate pseudo random numbers that are good for security purposes.
This lab covers the following topics:

\begin{itemize}[noitemsep]
\item Pseudo random number generation
\item Mistakes in random number generation
\item Generating encryption key
\item The \texttt{/dev/random} and \texttt{/dev/urandom} device files  
\end{itemize}


\paragraph{Lab environment.}\seedenvironmentC


% *******************************************
% SECTION
% ******************************************* 
\section{Lab Tasks}



% -------------------------------------------
% SUBSECTION
% ------------------------------------------- 
\subsection{Task 1: Generate Encryption Key in a Wrong Way}

To generate good pseudo random numbers, we need to start with something
that is random; otherwise, the outcome will be quite predictable. 
The following program uses the current time as a seed for the 
pseudo random number generator.


\begin{lstlisting}[caption="Generating a 128-bit encryption key", label=enc:code:key_gen]
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define KEYSIZE 16

void main()
{
  int i;
  char key[KEYSIZE];

  printf("%lld\n", (long long) time(NULL));
  srand (time(NULL));       (*@\ding{192}@*)

  for (i = 0; i< KEYSIZE; i++){
     key[i] = rand()%256;
     printf("%.2x", (unsigned char)key[i]);
  }
  printf("\n");
}
\end{lstlisting}
 

The library function \texttt{time()} returns the time as the number of seconds since the Epoch,
\texttt{1970-01-01 00:00:00 +0000 (UTC)}. Run the code above, and describe your
observations. Then, comment out Line~\ding{192}, run the program again, and describe your
observations. Use the observations in both cases to explain the purpose of 
the \texttt{srand()} and \texttt{time()} functions in the code. 




% -------------------------------------------
% SUBSECTION
% ------------------------------------------- 
\subsection{Task 2: Guessing the Key}


On April 17, 2018, Alice finished her tax return, and she
saved the return (a PDF file) on her disk. To protect the file, she encrypted the PDF file 
using a key generated from the program described in Task 1. She wrote down the key
in a notebook, which is securely stored in a safe. 
A few month later, Bob broke into her computer and gets a copy of the encrypted 
tax return. Since Alice is CEO of a big company, this file is very valuable. 


Bob cannot get the encryption key, but by looking around Alice's computer, he 
saw the key-generation program, and suspected that Alice's encryption key may
be generated by the program. He also noticed the timestamp of the encrypted file, which
is \texttt{"2018-04-17 23:08:49"}. He guessed that the 
key may be generated within a two-hour window before the file was 
created. 


Since the file is a PDF file, which has a header. The 
beginning part of the header is always the version number. Around 
the time when the file was created, PDF-1.5 was the most common version,
i.e., the header starts with \texttt{\%PDF-1.5}, which 
is 8 bytes of data. The next 8 bytes of the data are quite easy to 
predict as well. Therefore, Bob easily got the first 16 bytes of the plaintext.
Based on the meta data of the encrypted file, he knows that 
the file is encrypted using \texttt{aes-128-cbc}. Since AES is a 128-bit cipher, the 16-byte
plaintext consists of one block of plaintext, so Bob knows a block of plaintext and its
matching ciphertext. Moreover, Bob also knows 
the Initial Vector (IV) from the encrypted file (IV is never encrypted).
Here is what Bob knows:


\begin{lstlisting}
Plaintext:  255044462d312e350a25d0d4c5d80a34
Ciphertext: d06bf9d0dab8e8ef880660d2af65aa82
IV:         09080706050403020100A2B2C2D2E2F2 
\end{lstlisting}
 

Your job is to help Bob find out Alice's encryption key, so you can decrypt the entire
document. You should write a program
to try all the possible keys. If the key was generated correctly, this task will not be
possible. However, since Alice used \texttt{time()} to seed her random number generator, you
should be able to find out her key easily. You can use the \texttt{date} command to print out
the number of seconds between
a specified time and the Epoch, \texttt{1970-01-01 00:00:00 +0000 (UTC)}.
See the following example. 

\begin{lstlisting}
$ date -d "2018-04-15 15:00:00" +%s
1523818800
\end{lstlisting}




% -------------------------------------------
% SUBSECTION
% ------------------------------------------- 
\subsection{Task 3: Measure the Entropy of Kernel}

In the virtual world, it is difficult to create randomness, i.e., software alone is hard to
create random numbers. Most systems resort to the physical world to gain the randomness. 
\texttt{Linux} gains the randomness from the following physical resources: 


\begin{lstlisting}
   void add_keyboard_randomness(unsigned char scancode);
   void add_mouse_randomness(__u32 mouse_data);
   void add_interrupt_randomness(int irq);
   void add_blkdev_randomness(int major);
\end{lstlisting}

The first two are quite straightforward to understand: the first
one uses the timing between key presses;  the second one
uses mouse movement and interrupt timing; the third one
gathers random numbers using the interrupt timing.  Of course,
not all interrupts are good sources of randomness.  
For example, the timer interrupt is not a good choice, because it 
is predictable. However, disk interrupts are a better measure.  
The last one measures the finishing time of block device requests.


The randomness is measured using {\em entropy}, which is different from 
the meaning of entropy in the information theory. Here, it simply 
means how many bits of random numbers the system currently has. 
You can find out how much
entropy the kernel has at the current moment using the following 
command. 

\begin{lstlisting}[backgroundcolor=]
$ cat /proc/sys/kernel/random/entropy_avail
\end{lstlisting}

Let us monitor the change of the entropy by 
running the above command via \texttt{watch}, which 
executes a program periodically, showing the output 
in fullscreen. The following command 
runs the \texttt{cat} program every \texttt{0.1}
second. 

\begin{lstlisting}
$ watch -n .1 cat /proc/sys/kernel/random/entropy_avail
\end{lstlisting}
 
Please run the above command. While it is running, move your mouse,
click your mouse, type somethings, read a large file, visit a website. 
What activities increases the entropy significantly. 
Please describe your observation 
in your report.




% -------------------------------------------
% SUBSECTION
% ------------------------------------------- 
\subsection{Task 4: Get Pseudo Random Numbers from \texttt{/dev/random}}


\linux stores the random data collected from the physical resources into 
a random pool, and then uses two devices to turn the randomness 
into pseudo random numbers. These two devices are \texttt{/dev/random} and 
\texttt{/dev/urandom}. They have different behaviors. The \texttt{/dev/random}
device is a blocking device. Namely, every time a random number is given out by 
this device, the entropy of the randomness pool will be 
decreased. When the entropy reaches zero, \texttt{/dev/random} will block,
until it gains enough randomness.


Let us design an experiment to observe the behavior of the \texttt{/dev/random} device.
We will use the \texttt{cat} command to keep reading pseudo random numbers from 
\texttt{/dev/random}. We pipe the output to \texttt{hexdump} for nice printing.


\begin{lstlisting}
$ cat /dev/random | hexdump
\end{lstlisting}


Please run the above command and at the same time use the \texttt{watch}
command to monitor the entropy. What happens if you do not move your mouse or type anything. 
Then, randomly move your mouse and see whether you can observe any difference. 
Please describe and explain your observations. 



\paragraph{Question:} If a server uses 
\texttt{/dev/random} to generate the random session key
with a client. Please describe how you can launch a 
Denial-Of-Service (DOS) attack on such a server. 



% -------------------------------------------
% SUBSECTION
% ------------------------------------------- 
\subsection{Task 5: Get Random Numbers from {\tt /dev/urandom}}

\linux provides another way to access the random pool via the {\tt
/dev/urandom} device, except that this device will not block.
Both {\tt /dev/random} and {\tt /dev/urandom} use the random data
from the pool to generate pseudo random numbers. When the entropy 
is not sufficient, {\tt /dev/random} will pause, while {\tt /dev/urandom} will
keep generating new numbers. Think of the data in the pool as 
the ``seed'', and as we know, we can use a seed to generate as many 
pseudo random numbers as we want. 

Let us see the behavior of \texttt{/dev/urandom}. 
We again use \texttt{cat} to get pseudo random
numbers from this device. Please run the following command, and 
the describe whether 
moving the mouse has any effect on the outcome. 

\begin{lstlisting}
$ cat /dev/urandom | hexdump
\end{lstlisting}



Let us measure the quality of the random number. We can use a tool called \texttt{ent}, which
has already been installed in our VM. 
According to its manual, ``\texttt{ent} applies various tests to sequences of bytes stored in files and
reports the results of those tests. The program is useful for evaluating pseudo-random number
generators for encryption and statistical sampling applications, compression algorithms, and
other applications where the information density of a file is of interest''.  Let
us first generate 1 MB of pseudo random number from \texttt{/dev/urandom} and 
save them in a file. Then we run \texttt{ent} on the file. Please describe your 
outcome, and analyze whether the quality of the random numbers is good or not. 


\begin{lstlisting}
$ head -c 1M /dev/urandom > output.bin
$ ent output.bin
\end{lstlisting}
 



Theoretically speaking, the
{\tt /dev/random} device is more secure, but in practice, 
there is not much difference, because the ``seed'' used by 
\texttt{/dev/urandom} is random and non-predictable~(\texttt{/dev/urandom} 
does re-seed whenever new random data become available). 
A big problem of the blocking behavior of \texttt{/dev/random} is that
blocking can lead to denial of service attacks.
Therefore, it is recommended that we use {\tt /dev/urandom} to get random 
numbers. To do that in our program, we 
just need to read directly from this device file. The following code 
snippet shows how.


\begin{lstlisting}
  #define LEN 16  // 128 bits

  unsigned char *key = (unsigned char *) malloc(sizeof(unsigned char)*LEN);
  FILE* random = fopen("/dev/urandom", "r");
  fread(key, sizeof(unsigned char)*LEN, 1, random);
  fclose(random);
\end{lstlisting}


Please modify the above code snippet to generate a 256-bit encryption key. Please 
compile and run your code; print out the numbers and 
include the screenshot in the report. 




% *******************************************
% SECTION
% *******************************************
\section{Submission}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\input{\commonfolder/submission}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\end{document}
